OCR vs ADE: Mechanisms Behind the Methods
dev.to·1d·
Discuss: DEV
📄OCR
Show HN: Lore Engine – Turn 10-hour lectures into 2 hours of comprehensive notes
github.com·1d·
Discuss: Hacker News
📄Document Streaming
The Dunhuang Culture 敦煌文化 Database
digitalorientalist.com·9h
📜Text Collation
Show HN: Rebuilt Bible search app to run 100% client-side with Transformers.js
biblos.app·46m·
Discuss: Hacker News
📜Binary Philology
Announcing the 2025 NDSA Excellence Award Winners
ndsa.org·9h
🏛️PREMIS Metadata
From Documents to Dialogue: A step-by-step RAG Journey
dev.to·8h·
Discuss: DEV
📊Multi-vector RAG
The Best Ways to Digitize Your Notes
lifehacker.com·9h
📄Document Digitization
The artificial complexity of OOXML files (the PPTX case)
blog.documentfoundation.org·5h·
Discuss: Hacker News
📟Terminal Typography
To MD - Convert PDFs, Word, HTML and more to Markdown
tomd.io·14h·
Discuss: Hacker News
🔄Migration Tools
Welcome to LIL’s Data.gov Archive Search
lil.law.harvard.edu·2h
💾Data Preservation
Efficient and accurate search in petabase-scale sequence repositories
nature.com·2d·
Discuss: Hacker News
🔄Burrows-Wheeler
Pecia system
rhollick.wordpress.com·1d
💧Manuscript Watermarks
Take Note: Cyber-Risks With AI Notetakers
darkreading.com·1d
🎫Kerberos Attacks
Extract speaker notes from PowerPoint to text
dri.es·1d
📜Palimpsest Analysis
IASC: Interactive Agentic System for ConLangs
arxiv.org·18h
🌳Context free grammars
YouTube gets ~5% CTR lift on Shorts by replacing embedding tables with Semantic IDs
shaped.ai·22h
📊Feed Optimization
Offensive OSINT s05e10 - Interactive investigative stories part 1
offensiveosint.io·2d
🌐WARC Forensics
MetaGraph: Scalable annotated de Bruijn graphs for DNA indexing and alignment
github.com·1d·
Discuss: Hacker News
🔄Burrows-Wheeler